@MichaelChirico
Member

Closes #5948

Turns out, this code in support of #2273 (#5224) is not needed -- I don't notice any difference before/after this change and we pass the related test:

if (loaded[["sf"]]) { #2273
  DT = as.data.table(st_read(system.file("shape/nc.shp", package = "sf"), quiet=TRUE))
  test(15, DT[1:3, .(NAME, FIPS, geometry)], output="Ashe.*-81.4.*Surry.*-80.4")
  dsf = sf::st_as_sf(data.table(x=1:10, y=1:10, s=sample(1:2, 10, TRUE)), coords=1:2)
  test(16, split(dsf, dsf$s), list(`1` = dsf[dsf$s == 1, ], `2` = dsf[dsf$s == 2, ]))
}

It's possible our testing is just not extensive enough.

@github-actions

github-actions bot commented Dec 6, 2024

No obvious timing issues in HEAD=vctrs-print
Comparison Plot

Generated via commit 7253358

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 4 minutes and 53 seconds
Installing different package versions 9 minutes and 34 seconds
Running and plotting the test cases 2 minutes and 35 seconds

@MichaelChirico
Member Author

This test fails:

registerS3method("format", "foo2130", function(x, ...) rep("All hail foo",length(x)))
test(2130.15, print(DT), output="All hail foo")  # e.g. sf:::format.sfc rather than sf:::format.sfg on each item

My sense is that if we only fail a toy example, we should just break it. In general I am thinking the better solution here is to add format_col and/or format_list_item methods as needed.
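As a rough illustration of that suggestion, the sketch below registers methods on data.table's `format_col` and `format_list_item` generics for two toy classes. The class names `foo_col` and `foo_item` are hypothetical and exist only for this example; it assumes an installed data.table version that exports both generics.

```r
library(data.table)

# Hypothetical toy classes: a classed list column and its classed items.
item = structure(list(1), class = "foo_item")
col  = structure(list(item, item), class = "foo_col")

# Item-level hook: returns one string for a single list element.
format_list_item.foo_item = function(x, ...) "All hail foo"
registerS3method("format_list_item", "foo_item", format_list_item.foo_item)

# Column-level hook: returns one string per row of the column.
format_col.foo_col = function(x, ...) rep("All hail foo", length(x))
registerS3method("format_col", "foo_col", format_col.foo_col)

format_list_item(item)
format_col(col)
```

A package author would pick whichever hook matches the level at which the class is defined: `format_col` when the whole column carries the class, `format_list_item` when only the elements do.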

I think given the potential for breaking change, it's best to save this PR for 1.18.0.

@MichaelChirico MichaelChirico added this to the 1.18.0 milestone Dec 6, 2024
@MichaelChirico MichaelChirico requested a review from aitap March 4, 2025 20:21
aitap added 2 commits March 5, 2025 13:00
'sf' is one of the few packages that uses a list column in a data.frame
with a correctly working format() method. Since we disregard such
methods now, test that the correct behaviour still happens thanks to the
format() methods of the individual list items.
Member

@aitap aitap left a comment

Let's compare with other data.frames.

  • tibble refuses classed lists that don't come from its own packages:
> tibble(a = structure(1, class = 'foo2130'))
# A tibble: 1 × 1
  a
  <foo2130>
1 All hail foo
> tibble(a = structure(list(1), class = 'foo2130'))
Error in `tibble()`:
! All columns in a tibble must be vectors.
✖ Column `a` is a `foo2130` object.
Run `rlang::last_trace()` to see where the error occurred.
> tibble(a = list(structure(1, class = 'foo2130')))
# A tibble: 1 × 1
  a
  <list>
1 <foo2130 [1]>
> tibble(a = 1, b = list(mtcars))
# A tibble: 1 × 2
      a b
  <dbl> <list>
1     1 <df [32 × 11]>
> tibble(a = 1, b = list_of(mtcars))
# A tibble: 1 × 2
      a               b
  <dbl> <list<df[,11]>>
1     1       [32 × 11]
  • data.frame is consistent in saying that format(<data.frame>) is whatever format(...) returns for the individual columns, but doesn't support list columns that well. By default list elements are converted to data.frame columns. Constructing one with a list verbatim requires I(), which overrides format. A list inserted manually into an existing data.frame will be formatted using the usual format method:
> data.frame(a = 1, b = list_of(mtcars)) # NB: uses as.data.frame.vctrs_vctr to make the list into a column
# omitted: similar to current data.table behaviour, i.e., formats the whole `mtcars`
> x <- data.frame(a = 1)
> x$b <- list(mtcars)
> x
# same
> data.frame(a = 1, b = I(list(mtcars))) # stores list as is but with format() overridden
  a            b
1 1 c(21, 21....
> data.frame(a = structure(list(1), class = 'foo2130')) # tries to convert a list to data.frame
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
  cannot coerce class ‘"foo2130"’ to a data.frame
> traceback()
4: stop(gettextf("cannot coerce class %s to a data.frame", sQuote(deparse(class(x))[1L])),
       domain = NA)
3: as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
2: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors)
1: data.frame(a = structure(list(1), class = "foo2130"))
> x <- data.frame(a = 1)
> x$b <- structure(list(1), class = 'foo2130') # uses format()
> x
  a            b
1 1 All hail foo
> data.frame(a = I(structure(list(1), class = 'foo2130'))) # uses format.AsIs()
  a
1 1
  • DataFrame allows classed lists but expands unclassed lists or data.frames into columns. Seems to ignore format methods altogether:
> DataFrame(a = structure(list(1,10), class = 'foo2130'))
DataFrame with 2 rows and 1 column
          a
  <foo2130>
1         1
2        10
> DataFrame(a = list(1,2))
DataFrame with 1 row and 2 columns
       a.X1      a.X2
  <numeric> <numeric>
1         1         2
> DataFrame(a = list(mtcars))
DataFrame with 32 rows and 11 columns
                      a.mpg     a.cyl    a.disp      a.hp    a.drat      a.wt    a.qsec      a.vs      a.am    a.gear
                  <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
Mazda RX4              21.0         6       160       110      3.90     2.620     16.46         0         1         4
Mazda RX4 Wag          21.0         6       160       110      3.90     2.875     17.02         0         1         4
# omitted
> DataFrame(a = structure(1, class = 'foo2130'))
DataFrame with 1 row and 1 column
          a
  <foo2130>
1         1
  • data.table formats lists specially by default, but defers to a format method if it exists:
> data.table(a = 1, b = list(mtcars))
       a                   b
   <num>              <list>
1:     1 <data.frame[32x11]>
> data.table(a = 1, b = list_of(mtcars))
# omitted: formats whole `mtcars` into a single string

Since we allow list columns of any class without applying as.data.(frame|table) first, we might as well format all lists, even classed ones, in a special compact form, even though a few methods, such as sf:::format.sfc, exist to do the right thing. The sf classes still print specially because data.table:::format_list_item.default still checks for a format method and eventually calls sf:::format.sfg.
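The per-item fallback described here can be sketched in base R as follows. This is not data.table's actual implementation: `has_format_method` and `compact_item` are hypothetical helpers written for this illustration, and the compact `<class[dims]>` form is modelled on the `<data.frame[32x11]>` output shown earlier.

```r
# Hypothetical helper: is a format() S3 method registered for any class of x?
has_format_method = function(x) {
  any(vapply(
    class(x),
    function(cl) !is.null(getS3method("format", cl, optional = TRUE)),
    logical(1L)
  ))
}

# Hypothetical helper: format one list item as a single string.
compact_item = function(x) {
  if (has_format_method(x)) {
    out = format(x)
    # Only trust the method if it yields exactly one string for the item.
    if (length(out) == 1L) return(out)
  }
  # Otherwise fall back to a compact "<class[dims]>" summary.
  dims = if (is.null(dim(x))) length(x) else paste(dim(x), collapse = "x")
  sprintf("<%s[%s]>", class(x)[1L], dims)
}

# Same toy method as in the failing test discussed earlier in the thread.
registerS3method("format", "foo2130", function(x, ...) rep("All hail foo", length(x)))

compact_item(structure(1, class = "foo2130"))  # uses the registered format method
compact_item(mtcars)                           # format() gives 11 columns, so falls back
```

The length check is the design point: a method that formats the item into one string is used as is, while anything that expands into several elements is replaced by the compact summary.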

We could also register format_col.sfc and set Enhances: sf. No need for conditional registration here since we own the generic.

@codecov

codecov bot commented Mar 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.50%. Comparing base (41152fd) to head (7253358).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6637      +/-   ##
==========================================
- Coverage   98.50%   98.50%   -0.01%     
==========================================
  Files          79       79              
  Lines       14761    14759       -2     
==========================================
- Hits        14540    14538       -2     
  Misses        221      221              

@MichaelChirico
Member Author

Thanks Ivan! For future reference, I think you omitted your format.foo2130 method. Also I'm not sure which package's DataFrame that is, is it maybe {S4Vectors}?

@MichaelChirico MichaelChirico merged commit 96c3e6a into master Jul 1, 2025
12 checks passed
@MichaelChirico MichaelChirico deleted the vctrs-print branch July 1, 2025 16:39
@aitap
Member

aitap commented Jul 2, 2025

Correct, that's the S4 class from Bioconductor. I must have used registerS3method("format", "foo2130", function(x, ...) rep("All hail foo",length(x))) from the previous comment, but I wouldn't have to remember it if I had copied it in my message.

Development

Successfully merging this pull request may close these issues.

list sub-class with format() method prints full contents